Abstract

We show that any distribution on $\{-1,+1\}^n$ that is $k$-wise independent fools any
halfspace $h : \{-1,+1\}^n \to \{-1,+1\}$ with error $\epsilon$ for $k = O(\log^2(1/\epsilon)/\epsilon^2)$. Up to
logarithmic factors, our result matches a lower bound by Benjamini, Gurel-Gurevich,
and Peled (2007) showing that $k = \Omega(1/(\epsilon^2 \cdot \log(1/\epsilon)))$. Using standard constructions of
$k$-wise independent distributions, we obtain the first explicit pseudorandom generators
$G : \{-1,+1\}^s \to \{-1,+1\}^n$ that fool halfspaces. Specifically, we fool halfspaces with
error $\epsilon$ and seed length $s = k \cdot \log n = O(\log n \cdot \log^2(1/\epsilon)/\epsilon^2)$.

Our approach combines classical tools from real approximation theory with structural
results on halfspaces by Servedio (Comput. Complexity 2007).

1 Introduction



Halfspaces, or threshold functions, are a central class of Boolean functions
$h : \{-1,+1\}^n \to \{-1,+1\}$ of the form:

$$h(x) = \mathrm{sign}(w_1 x_1 + \cdots + w_n x_n - \theta),$$

where the weights $w_1, \ldots, w_n$ and the threshold $\theta$ are arbitrary real numbers. These functions
have been studied extensively in a variety of contexts. In computer science, the work on 
halfspaces dates back to the study of switching functions, see for instance the books [Der65, 
Hu65, LC67, She69, Mur71]. In computational complexity, much effort has been put into 
understanding constant-depth circuits of halfspaces. On the one hand this has resulted 
in surprising inclusions (such as the simulation of depth-$d$ circuits of halfspaces by
depth-$(d+1)$ circuits of majority gates [GHR92, GK98]), but on the other hand many seemingly
basic questions remain unsolved: for instance it is conceivable that every function in NP is 
computable by a depth-2 circuit of halfspaces [HMP+93, Kra91, KW91, FKL+01]. In learning 
theory, the problem of learning an unknown halfspace has arguably been the most influential 
problem in the development of the field, with algorithms such as Perceptron, Weighted 
Majority, Boosting, and Support Vector Machines emerging from this study. Halfspaces 
(with non-negative weights) have also been studied extensively in game theory and social 
choice theory, where they are referred to as "weighted majority games" and have been 
analyzed as models for voting, see e.g., [Pen46, Isb69, DS79, TZ92]. 

In this work we make progress on a natural complexity-theoretic question about halfspaces.
We construct the first explicit pseudorandom generators $G : \{-1,+1\}^s \to \{-1,+1\}^n$
with short seed length $s$ that fool any halfspace $h : \{-1,+1\}^n \to \{-1,+1\}$, i.e., satisfy

$$\left| E_{x \in \{-1,+1\}^s}[h(G(x))] - E_{x \in \{-1,+1\}^n}[h(x)] \right| \le \epsilon,$$

for a small $\epsilon$. We actually prove that the class of distributions known as $k$-wise independent
has this "fooling" property for a suitable $k$; as pointed out below, a generator can then be
obtained using any of the standard explicit constructions of such distributions.

Definition 1.1. A distribution $\mathcal{D}$ on $\{-1,+1\}^n$ is $k$-wise independent if the projection of
$\mathcal{D}$ on any $k$ indices is uniformly distributed over $\{-1,+1\}^k$.
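Definition 1.1 can be checked by brute force on small examples. The following is a minimal sketch, not part of the paper: the helper name `is_k_wise_independent` and the toy distribution `toy` (two uniform seed bits plus their product) are illustrative assumptions. It tests whether the uniform distribution over a given support projects uniformly onto every set of $k$ coordinates.

```python
from collections import Counter
from itertools import combinations, product


def is_k_wise_independent(support, k):
    """Check Definition 1.1 for the uniform distribution over `support`.

    `support` is a list of equal-length tuples with entries in {-1, +1}
    (repeats allowed). Hypothetical helper, for illustration only.
    """
    n = len(support[0])
    total = len(support)
    for idx in combinations(range(n), k):
        counts = Counter(tuple(x[i] for i in idx) for x in support)
        # Each of the 2^k sign patterns must occur with probability 2^-k.
        for pattern in product((-1, 1), repeat=k):
            if counts[pattern] * 2**k != total:
                return False
    return True


# Toy distribution (an assumption for this example): two uniform seed bits,
# and a third bit equal to their product.
toy = [(a, b, a * b) for a in (-1, 1) for b in (-1, 1)]
```

Under these assumptions, `toy` is pairwise independent but not 3-wise independent, since its third coordinate is determined by the first two.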

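Theorem 1.2 (Main). Let $\mathcal{D}$ be a $k$-wise independent distribution on $\{-1,+1\}^n$, and let
$h : \{-1,+1\}^n \to \{-1,+1\}$ be a halfspace. Then $\mathcal{D}$ fools $h$ with error $\epsilon$, i.e.,

$$\left| E_{x \leftarrow \mathcal{D}}[h(x)] - E_{x \leftarrow \mathcal{U}}[h(x)] \right| \le \epsilon, \quad \text{provided } k \ge \frac{C}{\epsilon^2} \log^2(1/\epsilon),$$

where $C$ is an absolute constant and $\mathcal{U}$ is the uniform distribution over $\{-1,+1\}^n$.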

Our Theorem 1.2 matches, up to logarithmic factors, a lower bound by Benjamini, Gurel-Gurevich
and Peled [BGGP07] establishing that if $k = o(1/(\epsilon^2 \cdot \log(1/\epsilon)))$ then there exists
a $k$-wise independent distribution $\mathcal{D}$ on $\{-1,+1\}^n$ such that $\Pr_{x \leftarrow \mathcal{D}}[\sum_i x_i > 0] \ge 1/2 + \epsilon$,
i.e., $\mathcal{D}$ does not fool the majority function with error $\epsilon$.

Standard explicit constructions of $k$-wise independent distributions over $\{-1,+1\}^n$ have
seed length $O(k \cdot \log n)$ [CG89, ABI86], which is optimal up to constant factors [CGH+85].
Plugging these into Theorem 1.2, we obtain explicit pseudorandom generators
$G : \{-1,+1\}^s \to \{-1,+1\}^n$ that fool any halfspace $h : \{-1,+1\}^n \to \{-1,+1\}$ with error $\epsilon$ and have seed
length $s = O(\log n \cdot \log^2(1/\epsilon)/\epsilon^2)$.

Background and comparison with previous explicit generators. The literature is 
rich with explicit generators for various classes, such as small constant-depth circuits with 
various gates [AW, Nis91, LVW93, Vio07, Baz07, Bra09], low-degree polynomials [NN93,
AGHP92, BV07, Lov08, Vio08], and one-way small-space algorithms [Nis92]. Many of
these classes (such as low-degree polynomials and $AC^0$ circuits) provably cannot implement
halfspaces, and it is not known how to implement an arbitrary halfspace in any
of these classes, so none of these results gives Theorem 1.2. However, some of these results
[Nis92, LVW93, Vio07] give generators for the restricted class of halfspaces given by
$h(x) = \mathrm{sign}(\sum_{i=1}^n w_i x_i - \theta)$ where the weights are integers of magnitude at most poly$(n)$.
While it is well known that every halfspace has a representation with integer weights, in 
general it is not possible to represent an arbitrary halfspace with poly(n) integer weights. 
Indeed, an easy counting argument (see e.g. [MT94, Has94]) shows that if the weights are 
required to be integers then almost all halfspaces require weights of magnitude $2^{\Omega(n)}$, and
in fact some halfspaces require weights of magnitude $2^{\Theta(n \log n)}$ [Has94]. Our result is for
the entire class of halfspaces with no restriction on the weights, and much of the richness
of halfspaces only comes in this setting; for example, the "odd-max-bit" function [Bei94],
the "universal halfspace" [GHR92], and other important halfspaces [Has94] all require
exponentially large integer weights. Moreover, even for the restricted class of halfspaces where
the weights are integers of magnitude at most poly$(n)$, previous techniques [Nis92] give seed
length $s = O(\log^2 n)$ at best, while we achieve $s = O(\log n)$ for constant error.

Other related results. Several recent papers have studied the power of $k$-wise independent
distributions. An exciting recent result of Braverman [Bra09], which builds on an earlier
breakthrough of Bazzi [Baz07] (simplified by Razborov [Raz08]), shows that polylog$(n)$-wise
independent distributions fool small constant-depth circuits, settling a conjecture of Linial
and Nisan [LN90]. Benjamini et al. [BGGP07] showed that any $O(1/\epsilon^2)$-wise independent
distribution $\mathcal{D}$ on $\{-1,+1\}^n$ satisfies $|\Pr_{x \leftarrow \mathcal{D}}[\sum_i x_i > 0] - 1/2| \le \epsilon$, i.e., such distributions
fool the majority function. (We discuss [BGGP07] in more detail shortly. Here we note that
their result does not seem immediately relevant for constructing generators, because to fool
the majority function, with optimal error 0, one can just output $1^n$ with probability 1/2 and
$(-1)^n$ with probability 1/2.) None of these results applies to general halfspaces.
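The remark about majority can be verified by direct enumeration. The sketch below is illustrative and not from the paper: the names `majority`, `expectation`, `two_point` and the second halfspace `other` are assumptions. It confirms that the two-point distribution fools majority (for odd $n$) with error 0, while failing badly on a different halfspace.

```python
from itertools import product


def majority(x):
    """Majority of +/-1 bits; n is taken odd so there are no ties."""
    return 1 if sum(x) > 0 else -1


def expectation(f, support):
    """Expectation of f under the uniform distribution over `support`."""
    return sum(f(x) for x in support) / len(support)


n = 5  # odd, chosen only for illustration
uniform = list(product((-1, 1), repeat=n))
two_point = [(1,) * n, (-1,) * n]  # outputs 1^n or (-1)^n, each w.p. 1/2


def other(x):
    """A hypothetical halfspace sign(x_1 - x_2 - 1/2), with sign(0) = +1."""
    return 1 if x[0] - x[1] - 0.5 >= 0 else -1


majority_error = abs(expectation(majority, two_point) - expectation(majority, uniform))
other_error = abs(expectation(other, two_point) - expectation(other, uniform))
```

Here `majority_error` is 0 while `other_error` is 1/2, which is why fooling majority alone says little about general halfspaces.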

The problem of constructing pseudorandom generators for halfspaces has been considered 
by several authors in the recent literature. Rabani and Shpilka give an explicit construction
of an $\epsilon$-net, or $\epsilon$-hitting set, for halfspaces [RS08]: a set of size poly$(n, 1/\epsilon)$ which is
guaranteed to contain at least one point where $h(x) = +1$ and at least one point where
$h(x) = -1$ for any halfspace $h$ which takes on both values with probability at least $\epsilon$ under
the uniform distribution. However, their construction does not offer any guarantees about
the distribution of these values. [RS08] pose as a research goal "to build methodically a 
theory of pseudorandom generators for geometric functions" such as halfspaces. 

The problem of pseudorandom generators for halfspaces also arose in recent work by 
Gopalan and Radhakrishnan [GR09] on finding duplicates in a data stream. They required 
a pseudorandom distribution that allows one to estimate the influence of a variable in a 
halfspace, a problem which is in fact equivalent to constructing a pseudorandom generator 
for a related halfspace. They observe that Nisan's space generator [Nis92] suffices for the 
halfspaces arising in their context, but they raise the problem of constructing pseudorandom 
generators for general halfspaces. Our result does not improve the space bounds for their 
problem, but it does make the analysis simpler. 

1.1 Techniques 

Our proof combines tools from real approximation theory with structural results regarding
halfspaces. An important notion is that of an $\epsilon$-regular halfspace, which is a halfspace
$h(x) = \mathrm{sign}(\sum_i w_i x_i - \theta)$ where no more than an $\epsilon$-fraction of the 2-norm of its coefficient vector
$(w_1, \ldots, w_n)$ comes from any single coefficient $w_i$. We first show that $k$-wise independence
fools all $\epsilon$-regular halfspaces, and then use this to prove that $k$-wise independence fools all
halfspaces. Our proof can be broken down conceptually into three steps.

Step 1: Fooling regular halfspaces. Our starting point is Bazzi's observation ([Baz07],
Theorem 4.2) that to establish that every $k$-wise independent distribution on $\{-1,+1\}^n$
fools a Boolean function $f : \{-1,+1\}^n \to \{-1,+1\}$ with error $\epsilon$, it is sufficient to exhibit
two "sandwiching" polynomials $q_\ell, q_u : \{-1,+1\}^n \to \mathbb{R}$ of degree at most $k$ such
that:

• $q_u(x) \ge f(x) \ge q_\ell(x)$ for all $x \in \{-1,+1\}^n$; and
• $E_{\mathcal{U}}[q_u(x) - f(x)],\ E_{\mathcal{U}}[f(x) - q_\ell(x)] \le \epsilon$.

Using classical tools from real approximation theory, we give a self-contained proof of
the existence of univariate polynomials of degree $K(\epsilon) := \tilde{O}(1/\epsilon^2)$ which, roughly speaking,
provide a good sandwich approximator to the univariate function sign$(t)$ under the normal
distribution on $\mathbb{R}$. This is useful for us because of the following simple but crucial insight:
for any regular halfspace $h(x) = \mathrm{sign}(w \cdot x - \theta)$, the argument $w \cdot x - \theta$ is well-approximated
by a normal random variable (a precise error-estimate for this approximation is given by
the Berry-Esseen theorem). For any $\epsilon$-regular halfspace, we can thus plug $w \cdot x - \theta$ into our
univariate polynomials, and obtain low-degree sandwich polynomials for $h$. This establishes
that $K(\epsilon)$-wise independence fools all $\epsilon$-regular halfspaces.

Of course, there are halfspaces $\mathrm{sign}(w \cdot x - \theta)$ that are far from being $\epsilon$-regular and have
$w \cdot x - \theta$ distributed very unlike a Gaussian. To tackle general halfspaces, we use the notion of
the $\epsilon$-critical index of a halfspace, which was (implicitly) introduced in [Ser07] and has since
played a useful role in several recent results on halfspaces [OS08, MORS09, DS09]. Briefly,
assuming that the weights $w_1, \ldots, w_n$ are sorted by absolute value, the $\epsilon$-critical index is the
first index $i$ so that the weight vector $(w_i, w_{i+1}, \ldots, w_n)$ is $\epsilon$-regular. The previous Step 1
handled halfspaces that are regular, corresponding to $i = 1$. We now proceed by analyzing
two cases, based on whether $1 < i \le L(\epsilon)$ or $i > L(\epsilon)$, for $L(\epsilon) := \tilde{O}(1/\epsilon^2)$. In both
cases, it is convenient to think of the variables as partitioned into a "head" part consisting of the
first $L(\epsilon)$ variables and corresponding to the largest weights, and a "tail" part consisting
of the rest.
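The critical index is easy to compute directly from its definition. The following is a minimal sketch under the definitions given in the text (regularity measured relative to the 2-norm of the remaining weights, weights sorted by decreasing absolute value); the function names `is_regular` and `critical_index` are hypothetical, and the quadratic-time loop is chosen for clarity rather than efficiency.

```python
import math


def is_regular(w, eps):
    """True if no single |w_i| exceeds eps times the 2-norm of w."""
    norm = math.sqrt(sum(v * v for v in w))
    return norm > 0 and all(abs(v) <= eps * norm for v in w)


def critical_index(w, eps):
    """Return the 1-based eps-critical index of the weight vector w.

    Weights are sorted by decreasing absolute value; the result is the first
    index i such that (w_i, ..., w_n) is eps-regular, or None if no suffix
    is eps-regular.
    """
    s = sorted((abs(v) for v in w), reverse=True)
    for i in range(len(s)):
        if is_regular(s[i:], eps):
            return i + 1
    return None
```

For example, with `eps = 0.1`, 400 equal weights have critical index 1 (the halfspace is already regular), one large weight followed by 400 equal weights gives critical index 2, and geometrically decreasing weights such as $2^{0}, 2^{-1}, \ldots, 2^{-9}$ have no regular suffix at all.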

Step 2: Fooling halfspaces with small critical index ($i \le L(\epsilon)$). We argue that for
every setting of the head variables, the $\epsilon$-regularity of the tail is sufficient to ensure that the
overall halfspace gives the right bias. More precisely, we assume that our distribution $\mathcal{D}$ is
$(K(\epsilon) + L(\epsilon))$-wise independent, and note that each setting of the $i$ head variables gives an
$\epsilon$-regular halfspace $\mathrm{sign}(w \cdot x - \theta')$ over the tail variables (with the constant $\theta'$ depending
on the values of the head variables). Since the marginal distribution on the tail variables is
$K(\epsilon)$-wise independent for every setting of the head variables, the distribution $\mathcal{D}$ fools all
such halfspaces.

Step 3: Fooling halfspaces with large critical index ($i > L(\epsilon)$). In this case, we
show that the setting of the head variables alone is very likely to determine the value of the
function. More precisely, we show that a uniform random assignment to the head variables
is very likely to yield a halfspace $\mathrm{sign}(w_T \cdot x_T - \theta')$ over the tail variables $T$ in which

$$|\theta'| > \|w_T\|_2 / \epsilon. \qquad (*)$$

Now, as long as the tail variables are pairwise independent, by Chebyshev's inequality it
follows that the value $w_T \cdot x_T$ will be sharply concentrated within $[-\|w_T\|_2, +\|w_T\|_2]$. So,
for most settings of the head variables, we get something very close to a constant function
over the tail variables. Since a $(K(\epsilon) + 2)$-wise independent distribution gives us uniform
randomness for the head variables and pairwise independence for the tail variables, bounded
independence fools these halfspaces as well.
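The reason pairwise independence suffices for the Chebyshev step is that the variance of $w \cdot x$ depends only on pairwise correlations. The sketch below illustrates this on a small pairwise independent family; the construction `parity_family` (all nonempty-subset products of a few uniform seed bits) and the weight vector `w` are assumptions made for this example, not taken from the paper.

```python
import math
from itertools import combinations, product


def parity_family(m):
    """Support of a pairwise independent distribution on 2^m - 1 bits.

    Each output bit is the product of a distinct nonempty subset of m uniform
    +/-1 seed bits (an illustrative standard construction).
    """
    subsets = [s for r in range(1, m + 1) for s in combinations(range(m), r)]
    return [
        tuple(math.prod(seed[i] for i in s) for s in subsets)
        for seed in product((-1, 1), repeat=m)
    ]


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def tail_probability(values, threshold):
    """Fraction of values whose absolute value is at least `threshold`."""
    return sum(1 for v in values if abs(v) >= threshold) / len(values)


support = parity_family(3)          # 8 points in {-1,+1}^7
w = (3, 1, 4, 1, 5, 9, 2)           # arbitrary illustrative weights
values = [dot(w, x) for x in support]
norm_sq = sum(wi * wi for wi in w)  # squared 2-norm of w
mean = sum(values) / len(values)
variance = sum(v * v for v in values) / len(values) - mean ** 2
```

With only 3 truly random seed bits, the variance of $w \cdot x$ still equals $\|w\|_2^2$, so Chebyshev's inequality applies exactly as it would under the uniform distribution.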

The key idea behind the proof of (*) is that up to the critical index $\ell$ (which in this case
is large, $\ell > L(\epsilon)$) the weights $(w_1, \ldots, w_{\ell-1})$ must be decreasing fairly rapidly; this allows
us to prove strong anti-concentration for the distribution of $\theta'$, which in turn yields (*).

Overall, the amount of independence required for all the three steps to work is:
$$\max\{K(\epsilon),\ K(\epsilon) + L(\epsilon),\ K(\epsilon) + 2\} = \tilde{O}(1/\epsilon^2),$$
concluding this sketch of the proof of Theorem 1.2.

Univariate approximations to the sign function. As mentioned above, our approach 
relies on the existence of low-degree univariate sandwich approximators to the sign function 
under the normal distribution on R. Low-degree approximations to the sign function have 
been studied in both computer science and mathematics (see for instance [Pat92, EY07,
KS07] and the references therein). However, it appears that these results do not fit all our
requirements. Below we discuss how our approach relates to the work of Benjamini et al. 
[BGGP07] and Eremenko and Yuditskii [EY07]. 

Benjamini et al. prove that $O(1/\epsilon^2)$-wise independence suffices to fool the majority function,
using machinery from the theory of the classical moment problem. However, their proof
seems to be tailored quite specifically to the majority function, where the moments can be 
understood in terms of Krawtchouk polynomials and known bounds on such polynomials 
can be applied, so it seems difficult to extend their approach to general halfspaces (or indeed 
even to slight variants of the majority function). 

Bazzi's condition on the existence of degree-$k$ sandwiching polynomials mentioned above
is in fact both necessary and sufficient for all $k$-wise independent distributions to fool a
function $f$. Thus the [BGGP07] theorem implies the existence of $O(1/\epsilon^2)$-degree multivariate
sandwich polynomials for the majority function; symmetrization then implies that there exist
univariate polynomials which, roughly speaking, provide good sandwich approximation to
the function sign$(t)$ under the binomial distribution. This is similar in spirit to the result
we establish (mentioned in Step 1 above) about univariate polynomial approximators, but
there is a crucial difference: since the binomial distribution is supported only on the integers
$\{-n, \ldots, n\}$, it seems difficult to infer much about the behavior of the univariate polynomial
implicit in [BGGP07] on values outside of $\{-n, \ldots, n\}$. Hence, it is unclear whether these
polynomials can be used for general (or even regular) halfspaces.

In contrast, we work with the best possible pointwise approximation to the function
sign$(t)$ on the (piecewise) continuous domain $[-1, -a] \cup [a, 1]$. This uniform error bound is
convenient for dealing with regular halfspaces; moreover, working with the optimal pointwise 
approximator allows us to exploit various properties of optimal approximators that follow 
from the theory of Chebyshev approximation, in a way that is crucial for us to obtain the 
required "univariate sandwich approximators." 

We note that a recent work in approximation theory [EY07] analyzes the error achieved 
by this optimal polynomial and in particular establishes the limiting behavior of the error. 
For our purposes, though, we require the error to converge to the limit fairly rapidly and it 
is unclear whether the results of [EY07] guarantee this. We present an error analysis which 
is elementary (it only uses basic approximation theory) and moreover matches the limiting 
bounds of [EY07] up to a constant factor. 

Organization. In Section 3 we show how a certain univariate polynomial approximator
to sign$(t)$ yields low-degree sandwich polynomials for $\epsilon$-regular halfspaces over $\{-1,1\}^n$. In
Section 4 we construct the required univariate polynomial, which essentially gives sandwich
polynomials for sign$(t)$ under the normal distribution. In Section 5 we show how non-regular
halfspaces can be fooled using our results for regular halfspaces, concluding the proof of our
main theorem.
2 Preliminaries

Recall that the univariate function sign$(t)$ takes value $+1$ for $t \ge 0$ and $-1$ for $t < 0$.

Definition 2.1 (Halfspace). A halfspace is a Boolean function $f : \{-1,+1\}^n \to \{-1,+1\}$
which can be expressed as $f(x) = \mathrm{sign}(\sum_{i=1}^n w_i x_i - \theta)$ for some $\theta \in \mathbb{R}$, $(w_1, \ldots, w_n) \in \mathbb{R}^n$.

Throughout this paper we assume without loss of generality that halfspaces are normalized
to satisfy $w_1^2 + \cdots + w_n^2 = 1$. Such a representation can always be obtained by appropriate
scaling.

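Definition 2.2 (Fooling a Function Class). Let $f : \{-1,+1\}^n \to \{-1,+1\}$ be any function.
We say that a distribution $\mathcal{D}$ over $\{-1,+1\}^n$ fools $f$ with error $\epsilon$, or $\epsilon$-fools $f$, if

$$\left| E_{x \leftarrow \mathcal{D}}[f(x)] - E_{x \leftarrow \mathcal{U}}[f(x)] \right| \le \epsilon,$$

where $\mathcal{U}$ denotes the uniform distribution over $\{-1,+1\}^n$. We say that $\mathcal{D}$ fools a class of
functions $\mathcal{F}$ if $\mathcal{D}$ fools every $f \in \mathcal{F}$.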
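For small $n$ the fooling error of Definition 2.2 can be computed exactly by enumeration. The sketch below is illustrative only: the helper names `sign`, `halfspace`, `fooling_error` and the toy pairwise independent distribution `toy` are assumptions introduced for this example.

```python
from itertools import product


def sign(t):
    """sign(t) = +1 for t >= 0 and -1 for t < 0, as in the text."""
    return 1 if t >= 0 else -1


def halfspace(w, theta):
    """Return the function x -> sign(w . x - theta)."""
    return lambda x: sign(sum(wi * xi for wi, xi in zip(w, x)) - theta)


def fooling_error(f, support, n):
    """|E_D[f] - E_U[f]| where D is uniform over `support` and U over {-1,+1}^n."""
    e_d = sum(f(x) for x in support) / len(support)
    cube = list(product((-1, 1), repeat=n))
    e_u = sum(f(x) for x in cube) / len(cube)
    return abs(e_d - e_u)


# Toy pairwise independent distribution on 3 bits (third bit = product of the
# first two); an assumption made for this example.
toy = [(a, b, a * b) for a in (-1, 1) for b in (-1, 1)]

maj3 = halfspace((1, 1, 1), 0)      # majority of three bits
dictator = halfspace((1, 0, 0), 0)  # depends on x_1 only
```

In this example the pairwise independent `toy` fools `dictator` perfectly but fools `maj3` only with error 1/2, consistent with the need for $k$ to grow as $\epsilon$ shrinks.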

We require a few basic facts from probability theory: the Berry-Esseen theorem and the 
standard tail bounds of Hoeffding and Chebyshev. We discuss them next. 

The Berry-Esseen theorem is a version of the Central Limit Theorem with explicit error 
bounds: 

Theorem 2.3. (Berry-Esseen) Let $X_1, \ldots, X_n$ be a sequence of independent random variables
satisfying $E[X_i] = 0$ for all $i$, $\sqrt{\sum_i E[X_i^2]} = \sigma$, and $\sum_i E[|X_i|^3] = \rho_3$. Let
$S = (X_1 + \cdots + X_n)/\sigma$ and let $F$ denote the cumulative distribution function (cdf) of $S$. Then

$$\sup_x |F(x) - \Phi(x)| \le C \rho_3 / \sigma^3,$$

where $\Phi$ is the cdf of a standard Gaussian random variable (with mean zero and variance
one), and $C$ is a universal constant. [Shi86] has shown that one can take $C = .7915$.

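Corollary 2.4. Let $x_1, \ldots, x_n$ denote independent uniformly $\pm 1$ random signs and let $w_1, \ldots, w_n \in \mathbb{R}$.
Write $\sigma = \sqrt{\sum_i w_i^2}$ and assume $|w_i|/\sigma \le \tau$ for all $i$. Then for any interval $[a, b] \subseteq \mathbb{R}$,

$$\left| \Pr[a \le w_1 x_1 + \cdots + w_n x_n \le b] - \Phi\left(\left[\tfrac{a}{\sigma}, \tfrac{b}{\sigma}\right]\right) \right| \le 2\tau,$$

where $\Phi([c, d]) := \Phi(d) - \Phi(c)$. In particular,

$$\Pr[a \le w_1 x_1 + \cdots + w_n x_n \le b] \le \frac{|b - a|}{\sigma} + 2\tau.$$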
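Corollary 2.4 can be sanity-checked numerically in the simplest case of equal weights, where the sum of signs has an exact binomial distribution. The sketch below assumes $w_i = 1/\sqrt{n}$ (so $\sigma = 1$ and $\tau = 1/\sqrt{n}$); the function names are hypothetical and the intervals in the usage note are arbitrary choices.

```python
import math


def phi(z):
    """CDF of the standard Gaussian."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def exact_interval_prob(n, a, b):
    """Pr[a <= (x_1 + ... + x_n)/sqrt(n) <= b] for uniform +/-1 signs."""
    total = 0
    for heads in range(n + 1):
        s = (2 * heads - n) / math.sqrt(n)
        if a <= s <= b:
            total += math.comb(n, heads)
    return total / 2 ** n


def gaussian_gap(n, a, b):
    """Distance between the exact probability and the Gaussian estimate."""
    return abs(exact_interval_prob(n, a, b) - (phi(b) - phi(a)))
```

For $n = 100$ (so $\tau = 0.1$), intervals such as $[-1, 1]$, $[0, 0.5]$ and the single point $[0, 0]$ all satisfy both bounds of the corollary, with the gap to the Gaussian estimate well below $2\tau$.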

For completeness we recall the Hoeffding and Chebyshev bounds: 
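Theorem 2.5 (Hoeffding). Fix any $w \in \mathbb{R}^n$. For any $\gamma > 0$, we have

$$\Pr_{x \leftarrow \mathcal{U}}[w \cdot x \ge \gamma \|w\|] \le e^{-\gamma^2/2} \quad \text{and} \quad \Pr_{x \leftarrow \mathcal{U}}[w \cdot x \le -\gamma \|w\|] \le e^{-\gamma^2/2}.$$

Theorem 2.6 (Chebyshev). For any random variable $X$ with $E[X] = \mu$ and $\mathrm{Var}[X] = \sigma^2$
and any $k > 0$,

$$\Pr[|X - \mu| \ge k\sigma] \le \frac{1}{k^2}.$$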
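Both tail bounds can be compared with exact tail probabilities when all weights are equal. The following sketch assumes $w = (1, \ldots, 1)$, so that $w \cdot x = 2B - n$ for a binomial $B$ and $\|w\| = \sqrt{n}$; the function names and the choice $n = 64$ in the usage note are illustrative assumptions.

```python
import math


def exact_upper_tail(n, gamma):
    """Pr[x_1 + ... + x_n >= gamma * sqrt(n)] for uniform +/-1 signs."""
    threshold = gamma * math.sqrt(n)
    favorable = sum(
        math.comb(n, heads) for heads in range(n + 1) if 2 * heads - n >= threshold
    )
    return favorable / 2 ** n


def exact_two_sided_tail(n, k):
    """Pr[|x_1 + ... + x_n| >= k * sqrt(n)]; here the variance equals n."""
    threshold = k * math.sqrt(n)
    favorable = sum(
        math.comb(n, heads)
        for heads in range(n + 1)
        if abs(2 * heads - n) >= threshold
    )
    return favorable / 2 ** n
```

For $n = 64$, comparing `exact_upper_tail(64, g)` with $e^{-g^2/2}$ and `exact_two_sided_tail(64, k)` with $1/k^2$ for a few values of `g` and `k` shows that the Hoeffding bound is much tighter far out in the tail, which is why it is the one used to control the large-deviation case later.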
3 Fooling regular halfspaces

In this section we show how to fool regular halfspaces, defined next (recall all our halfspaces
are normalized to satisfy $w_1^2 + \cdots + w_n^2 = 1$).

Definition 3.1 (Regular Halfspace). A halfspace $f$ is said to be $\epsilon$-regular if it can be
expressed as $f(x) = \mathrm{sign}(w \cdot x - \theta)$ where for all $i = 1, \ldots, n$, we have

$$|w_i| \le \epsilon.$$

An $\epsilon$-regular halfspace $f(x) = \mathrm{sign}(w \cdot x - \theta)$ has the convenient property that the
cumulative distribution function (cdf) of $w \cdot x - \theta$ is everywhere within $\pm O(\epsilon)$ of the cdf of
the shifted Gaussian $N(-\theta, 1)$. This is a direct consequence of the Berry-Esseen Theorem.

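Given $\epsilon > 0$, we define the following parameters:

$$a := \frac{\epsilon^2}{C \log(1/\epsilon)}, \qquad K := \frac{4c \log(1/\epsilon)}{a} + 2 < \frac{5c \log(1/\epsilon)}{a} = O\!\left(\frac{\log^2(1/\epsilon)}{\epsilon^2}\right).$$

We assume without loss of generality that $\epsilon$ is a sufficiently small power of 2 (i.e., $\epsilon = 2^{-i}$ for
some integer $i$). The positive constants $C$ and $c$ will be chosen later; but (with foresight),
we will require that $C \gg c$.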

In this section we prove the following: 

Theorem 3.2 (Fooling $\epsilon$-regular halfspaces). Any $K(\epsilon)$-wise independent distribution fools
$\epsilon$-regular halfspaces with error $12\epsilon$.

To prove the theorem we construct certain "sandwiching" polynomials. We now define 
such polynomials and then explain why they are sufficient for our purposes. 

Definition 3.3. Let $f : \{-1,+1\}^n \to \{-1,+1\}$ be a Boolean function. A pair of real-valued
polynomials $q_\ell(x_1, \ldots, x_n)$, $q_u(x_1, \ldots, x_n)$ are said to be $\epsilon$-sandwich polynomials of degree $k$
for $f$ if they have the following properties:

• $\deg(q_u), \deg(q_\ell) \le k$;

• $q_u(x) \ge f(x) \ge q_\ell(x)$ for all $x \in \{-1,+1\}^n$;

• $E_{x \leftarrow \mathcal{U}}[q_u(x) - f(x)] \le \epsilon$ and $E_{x \leftarrow \mathcal{U}}[f(x) - q_\ell(x)] \le \epsilon$.

The following fact relates sandwiching polynomials to fooling: 

Lemma 3.4 (Bazzi). Let $f : \{-1,+1\}^n \to \{-1,+1\}$ be a Boolean function. Every $k$-wise
independent distribution $\epsilon$-fools $f$ if and only if there exist $\epsilon$-sandwich polynomials of degree
$k$ for $f$.
Figure 1: Qualitative plot of polynomial P.


We only use the "if" direction of this lemma for our proof, which follows straightforwardly 
by linearity of expectation. The other direction is a consequence of LP-duality (see [Baz07] 
for a proof). 
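The "if" direction rests on one fact: a $k$-wise independent distribution assigns every polynomial of degree at most $k$ the same expectation as the uniform distribution, so $E_{\mathcal{D}}[f] \le E_{\mathcal{D}}[q_u] = E_{\mathcal{U}}[q_u] \le E_{\mathcal{U}}[f] + \epsilon$, and symmetrically for $q_\ell$. The sketch below checks this fact on a toy example; the dictionary encoding of multilinear polynomials, the helper names, and the distribution `toy` are assumptions made for illustration.

```python
import math
from itertools import product


def evaluate(poly, x):
    """Evaluate a multilinear polynomial given as {tuple_of_indices: coeff}."""
    return sum(c * math.prod(x[i] for i in mono) for mono, c in poly.items())


def expectation(poly, support):
    """Expectation of the polynomial under the uniform distribution on support."""
    return sum(evaluate(poly, x) for x in support) / len(support)


# Pairwise independent toy distribution on 3 bits (third bit = product of the
# first two), and the full cube for comparison.
toy = [(a, b, a * b) for a in (-1, 1) for b in (-1, 1)]
cube = list(product((-1, 1), repeat=3))

# Degree-2 polynomial 1 + 2*x0 - x1*x2 + 0.5*x0*x2 (arbitrary coefficients).
degree_two = {(): 1.0, (0,): 2.0, (1, 2): -1.0, (0, 2): 0.5}
# Degree-3 monomial x0*x1*x2, which pairwise independence cannot control.
degree_three = {(0, 1, 2): 1.0}
```

The degree-2 polynomial has expectation 1 under both `toy` and `cube`, whereas the degree-3 monomial has expectation 1 under `toy` and 0 under `cube`, so the degree bound in Lemma 3.4 cannot be dropped.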

To construct appropriate sandwiching polynomials, we start by exhibiting a univariate
polynomial $P : \mathbb{R} \to \mathbb{R}$ that approximates the function sign $: \mathbb{R} \to \{-1, 1\}$ in a certain
specialized way which we discuss shortly. Let us be given an $\epsilon$-regular halfspace
$h(x) = \mathrm{sign}(w \cdot x - \theta)$, and assume that $|\theta|$ is small (the case where $|\theta|$ is large is simpler). We obtain
the upper sandwiching polynomial $q_u$ in Definition 3.3 by plugging into $P$ the value $w \cdot x - \theta$
scaled by a large $Z > 0$:

$$q_u(x) := P\left(\frac{w \cdot x - \theta}{Z}\right).$$

For the lower sandwiching polynomial $q_\ell$ we will use $-P(-t)$. The key properties of $P(t)$
are that (1) $P(t) \ge \mathrm{sign}(t)$ for every $t \in \mathbb{R}$, (2) $P(t)$ gives a good (error $\epsilon$) pointwise
approximation to sign$(t)$ for $t \in [-1/2, 1/2]$ except for $t$ in the small interval $[-2a, 0]$ where
the error is bounded by a constant, and (3) $P(t)$ does not grow too quickly for $|t| \ge 1/2$.
For a qualitative depiction of $P$ we refer the reader to Figure 1 (this figure is not an actual
plot but rather is intended to qualitatively illustrate the guarantees on the behavior of $P$
on various intervals; also the parameter 1/2 is replaced by $1 - a > 1/2$ for later needs).
Property (1), together with the fact that scaling by $Z$ does not change the value of the
halfspace, immediately gives the sandwiching property $q_u(x) \ge h(x)$ in Definition 3.3. To
establish the small-error property for $E_x[q_u(x) - h(x)]$ in Definition 3.3, we reason by case
analysis. First, by our choice of parameters, the small interval $[-2a, 0]$ remains small even
after scaling by $Z$, and so we use an anti-concentration argument to show that the input
$t = (w \cdot x - \theta)/Z$ to the polynomial is unlikely to land there, and thus the contribution
towards the error $E_x[q_u(x) - h(x)]$ is negligible in this case. Also, whenever the input $t$ to
the polynomial lands in $[-1/2, 1/2] \setminus [-2a, 0]$, the contribution to the error $E_x[q_u(x) - h(x)]$
is small because by Property (2) the polynomial approximates the sign function well there:
$q_u(x) - h(x) \le \epsilon$. Finally, the event that the input $t = (w \cdot x - \theta)/Z$ to $P$ has absolute value
bigger than 1/2 corresponds to the event that $|w \cdot x - \theta| > Z/2$. The scaling factor $Z$ is large
and the halfspace is $\epsilon$-regular, and so we can apply standard tail estimates to bound from
above this probability. Since by Property (3) the polynomial $P(t)$ does not grow too quickly
for $|t| \ge 1/2$, the contribution to the error $E_x[q_u(x) - h(x)]$ is small even in this case.

We now proceed with the formal proof. We start by recording in the following theorem
the properties of $P$.

Theorem 3.5. Let $0 < \epsilon < 0.1$ and let $a$ and $K$ be as defined above. There is a univariate
polynomial $P(t)$ such that $\deg(P) \le K$ with the following properties:

(1) $P(t) \ge \mathrm{sign}(t) \ge -P(-t)$ for all $t \in \mathbb{R}$;

(2) $P(t) \in [\mathrm{sign}(t), \mathrm{sign}(t) + \epsilon]$ for $t \in [-1/2, -2a] \cup [0, 1/2]$;

(3) $P(t) \in [-1, 1 + \epsilon]$ for $t \in (-2a, 0)$;

(4) $|P(t)| \le 2 \cdot (4t)^K$ for all $|t| \ge 1/2$.

We defer the proof of Theorem 3.5 to Section 4 and we proceed with the proof of Theorem 3.2.

3.1 Proof of Theorem 3.2

Let $h(x) = \mathrm{sign}(w \cdot x - \theta)$ be an $\epsilon$-regular halfspace (and recall $w_1^2 + \cdots + w_n^2 = 1$). Let

$$Z := \frac{\epsilon}{2a} = \frac{C \log(1/\epsilon)}{2\epsilon}.$$

We break the analysis into the following two cases, based on the magnitude of the threshold $\theta$.

3.1.1 $|\theta|$ is small ($|\theta| \le Z/4$)

The sandwich polynomials we use are:

$$q_u(x) := P\left(\frac{w \cdot x - \theta}{Z}\right), \qquad q_\ell(x) := -P\left(\frac{\theta - w \cdot x}{Z}\right). \qquad (1)$$

First, observe that for every $x \in \{-1,+1\}^n$ we have

$$q_u(x) \ge h(x) \ge q_\ell(x).$$

This is because from Theorem 3.5 with $t = (w \cdot x - \theta)/Z$ we get

$$q_u(x) \ge \mathrm{sign}\left(\frac{w \cdot x - \theta}{Z}\right) = \mathrm{sign}(w \cdot x - \theta) = h(x) \ge q_\ell(x).$$

In the rest of this section we bound the error of the approximation.

Lemma 3.6. $E_x[q_u(x) - h(x)] \le 10\epsilon$.

Proof. Define the random variable $H(x) = (w \cdot x - \theta)/Z$. We prove the desired upper bound
by partitioning the space into three events and bounding the contribution from each:

1. $S_1$ is the event that $H(x) \in [-\epsilon/Z, 0]$.

2. $S_2$ is the event that $|H(x)| \le 1/2$, but $S_1$ does not happen.

3. $S_3$ is the event that $|H(x)| > 1/2$.

We have

$$E_x[q_u(x) - h(x)] = \sum_{i=1}^{3} \Pr[S_i] \, E_x[q_u(x) - h(x) \mid S_i].$$

Case 1: In this case, the pointwise error is moderate (at most $2 + \epsilon$) and we use
Gaussian anti-concentration to argue that the event has small probability mass. The event
$H(x) \in [-\epsilon/Z, 0]$ implies that

$$\frac{w \cdot x - \theta}{Z} \in [-2a, 0] \;\Rightarrow\; q_u(x) \le 1 + \epsilon \;\Rightarrow\; q_u(x) - h(x) \le 2 + \epsilon,$$

using Item (3) in Theorem 3.5.

Since $h$ is $\epsilon$-regular, from Corollary 2.4 it follows that $\Pr_x[H(x) \in [-\epsilon/Z, 0]] \le 3\epsilon$. So,

$$\Pr_x[S_1] \, E_x[q_u(x) - h(x) \mid S_1] \le (2 + \epsilon) \cdot 3\epsilon \le 8\epsilon.$$

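Case 2: This event has high probability, but in this range we get good pointwise
approximation. The event $S_2$ implies that

$$\frac{w \cdot x - \theta}{Z} \in [-1/2, 1/2] \setminus [-2a, 0] \;\Rightarrow\; q_u(x) \le h(x) + \epsilon \;\Rightarrow\; q_u(x) - h(x) \le \epsilon,$$

where we used Item (2) in Theorem 3.5. So,

$$\Pr_x[S_2] \, E_x[q_u(x) - h(x) \mid S_2] \le 1 \cdot \epsilon = \epsilon.$$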
Case 3: Here we trade off the large magnitude of error (Item (4) in Theorem 3.5) with
the small probability of the event (bounded by the Hoeffding bound). Define the intervals

$$I_j^+ := \left(\frac{j}{2}, \frac{j+1}{2}\right] \text{ for } j = 1, 2, \ldots, \qquad I_k^- := \left[-\frac{k+1}{2}, -\frac{k}{2}\right) \text{ for } k = 1, 2, \ldots$$

We can write

$$\Pr[S_3] \, E_x[q_u(x) - h(x) \mid S_3] = \sum_{j \ge 1} \Pr[H(x) \in I_j^+] \, E_x[q_u(x) - h(x) \mid H(x) \in I_j^+] + \sum_{k \ge 1} \Pr[H(x) \in I_k^-] \, E_x[q_u(x) - h(x) \mid H(x) \in I_k^-]. \qquad (2)$$

Fix any integer $j \ge 1$. If $H(x) \in I_j^+$ then

$$\frac{j}{2} \le H(x) \le \frac{j+1}{2}.$$

Recalling that we have $|P(t)| \le 2 \cdot (4t)^K$ for $t \ge 1/2$, we get that

$$q_u(x) = P(H(x)) \le 2(2j + 2)^K.$$

Since $h(x) = 1$, we get

$$q_u(x) - h(x) = q_u(x) - 1 \le 2(2j + 2)^K - 1. \qquad (3)$$

Next we bound $\Pr_x[H(x) \in I_j^+]$ using the Hoeffding bound:

$$\Pr_x[H(x) \in I_j^+] \le \Pr_x\left[w \cdot x - \theta \ge \frac{jZ}{2}\right] \le \Pr_x\left[w \cdot x \ge \frac{jZ}{4}\right] \le e^{-j^2 Z^2/32}, \qquad (4)$$

where the second inequality uses the fact that $|\theta| \le Z/4$.

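The analysis of the intervals $I_k^-$ is similar (except $h(x) = -1$). For $H(x) \in I_k^-$ we get

$$|H(x)| \le \frac{k+1}{2} \;\Rightarrow\; q_u(x) \le 2(2k + 2)^K \;\Rightarrow\; q_u(x) - h(x) \le 2(2k + 2)^K + 1. \qquad (5)$$

Similarly, the Hoeffding bound gives

$$\Pr_x[H(x) \in I_k^-] \le \Pr_x\left[w \cdot x - \theta \le \frac{-kZ}{2}\right] \le \Pr_x\left[w \cdot x \le \frac{-kZ}{4}\right] \le e^{-k^2 Z^2/32}. \qquad (6)$$
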
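Plugging equations (3), (4), (5), (6) back into (2), we get

$$\Pr[S_3] \, E_x[q_u(x) - h(x) \mid S_3] \le \sum_{j \ge 1} \left(2(2j+2)^K - 1\right) e^{-j^2 Z^2/32} + \sum_{k \ge 1} \left(2(2k+2)^K + 1\right) e^{-k^2 Z^2/32} \le 4 \sum_{j \ge 1} e^{2jK - jZ^2/32},$$

where the last inequality follows by noting that, for $j \ge 1$, $(2j + 2)^K \le e^{2jK}$ and
$e^{j^2 Z^2/32} \ge e^{j Z^2/32}$. Now observe that

$$2K - \frac{Z^2}{32} \le \frac{C \log^2(1/\epsilon)}{\epsilon^2} \left(10c - \frac{C}{128}\right).$$

For a suitable choice of $C \gg c$, we have that $10c - C/128 < -1$, so

$$\Pr[S_3] \, E_x[q_u(x) - h(x) \mid S_3] \le 4 \sum_{j \ge 1} e^{-j C \log^2(1/\epsilon)/\epsilon^2} \le \epsilon.$$

Thus overall, we have $E_x[q_u(x) - h(x)] \le 10\epsilon$. □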

The lower sandwich bound follows by symmetry:

Lemma 3.7. $E_x[h(x) - q_\ell(x)] \le 10\epsilon$.

Proof. Since $q_\ell(x) \le h(x)$ for every $x$, we also have $-h(x) \le -q_\ell(x)$. Thus

$$-q_\ell(x) = P\left(\frac{\theta - w \cdot x}{Z}\right)$$

is an upper sandwich for the function $-h(x) = \mathrm{sign}(\theta - w \cdot x)$. As this does not change the
magnitude of $\theta$, we can apply the analysis of Lemma 3.6 to conclude that

$$E_x[h(x) - q_\ell(x)] = E_x[-q_\ell(x) - (-h(x))] \le 10\epsilon.$$

□

3.1.2 $|\theta|$ is large ($|\theta| > Z/4$)

We assume for simplicity that $\theta > Z/4$ (the case when $\theta$ is negative is handled similarly).
The sandwich polynomials we use are:

$$r_u(x) = P\left(\frac{w \cdot x - Z/4}{Z}\right), \qquad r_\ell(x) = -1. \qquad (7)$$
Lemma 3.8. $h(x) \ge r_\ell(x)$ for all $x \in \{-1,+1\}^n$. Further, $E_x[h(x) - r_\ell(x)] \le 2\epsilon$.

Proof. Note that $E_x[h(x) - r_\ell(x)] = 2\Pr_x[h(x) = 1]$. For large enough $C$ we have
$\Pr_x[h(x) = 1] = \Pr_x[w \cdot x \ge \theta] \le e^{-Z^2/32} \le \epsilon$. □

Lemma 3.9. $r_u(x) \ge h(x)$ for all $x \in \{-1,+1\}^n$. Further, $E_x[r_u(x) - h(x)] \le 12\epsilon$.

Proof. Observe that $r_u(x)$ is the upper sandwich polynomial for the halfspace
$h'(x) = \mathrm{sign}(w \cdot x - Z/4)$ as specified in Section 3.1.1. Thus we have

$$r_u(x) \ge h'(x) \ge h(x),$$

hence

$$E_x[r_u(x) - h(x)] = E_x[r_u(x) - h'(x)] + E_x[h'(x) - h(x)].$$

By Lemma 3.6, $E_x[r_u(x) - h'(x)] \le 10\epsilon$ whereas by the Hoeffding bound $E_x[h'(x) - h(x)] \le 2\epsilon$,
which completes the proof. □



4 Proof of Theorem 3.5 

This section contains our proof of Theorem 3.5. The key step is to exhibit a low-degree
univariate polynomial that approximates sign$(t)$ well when $|t| \in [a, 1]$ and is well-behaved
even for larger values of $|t|$ to be compatible with the sandwich condition. We phrase this as
a problem in univariate approximation. The solution we use is a low-degree polynomial $p(t)$
which is an optimal pointwise approximator to sign$(t)$ on $[-1, -a] \cup [a, 1]$. Such an optimal
polynomial exists and we prove that it is well-behaved for large $|t|$ using ideas from classical
approximation theory. However, it seems difficult to construct this polynomial explicitly and
bound its error.

Recent work by [EY07] analyzes the error achieved by such a polynomial and in particular 
establishes the limiting behavior of the error function. For our purposes, though, we require 
the error to converge to the limit fairly rapidly and it is unclear whether the results of [EY07] 
guarantee this. 

Instead, we bound the error by constructing a small error approximator $q(t)$ using Jackson's
theorem together with standard amplification ideas. While $q(t)$ might not be well-behaved
for large values of $t$, we only use it to bound from above the error of $p(t)$ on
$[-1, -a] \cup [a, 1]$. Our approach has the advantage of being self-contained and elementary
(using only standard ingredients from basic approximation theory) and matches the limiting 
bounds of [EY07] up to a constant factor. 

For a bounded continuous function $f : [-1, 1] \to \mathbb{R}$, we define its modulus of continuity
$\omega_f(\delta)$ as

$$\omega_f(\delta) := \sup\{|f(x) - f(y)| : x, y \in [-1, 1];\ |x - y| \le \delta\}.$$

A classical result of Dunham Jackson from the early twentieth century bounds the error of
the best degree-$\ell$ approximation to $f$.

Theorem 4.1. (Jackson's Theorem) [Che66] For $f$ as above and any integer $\ell \ge 1$,
there exists a polynomial $J(t)$ with $\deg(J) \le \ell$ so that

$$\max_{t \in [-1,1]} |J(t) - f(t)| \le 6\,\omega_f\!\left(\frac{1}{\ell}\right).$$

Recall the parameter $a = \frac{\epsilon^2}{C \log(1/\epsilon)}$ from the previous section. We now define the following
parameter:

$$m := \frac{c \log(1/\epsilon)}{a}.$$

It will be crucial for us that $m$ is even (see in particular the last paragraph of the proof
of Theorem 4.5); for this condition to be satisfied, it is of course enough that $c$ is even. (We
also note that the parameters $K$ and $m$ are such that $K = 4m + 2$.)

Lemma 4.2. For $a$, $m$ as above, there is a polynomial $q(t)$ of degree at most $2m$ such that

$$\max_{|t| \in [a,1]} |q(t) - \mathrm{sign}(t)| \le \epsilon^2.$$

Proof. Define the continuous and piecewise linear function $f(t)$ as

$$f(t) = \begin{cases} \mathrm{sign}(t) & a \le |t| \le 1 \\ t/a & |t| \le a. \end{cases}$$

Thus $f(t)$ increases linearly from $-1$ to $1$ in the range $[-a, a]$. A simple calculation shows
that $\omega_f(1/\ell) = 1/(a\ell)$. Taking $\ell \ge 25/a$, Jackson's theorem implies the existence of a
polynomial $J(t)$ of degree at most $\ell$ such that

$$\max_{a \le |t| \le 1} |J(t) - \mathrm{sign}(t)| \le \max_{t \in [-1,1]} |J(t) - f(t)| \le \frac{6}{a\ell} < \frac{1}{4}.$$

Our goal is to bring the error down to $\epsilon^2$. Rather than using Jackson's theorem for
this (which would require a much larger degree, on the order of $1/(a\epsilon^2)$), we use the
degree-$k$ amplifying polynomial

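\[ A_k(u) := \sum_{j \ge k/2} \binom{k}{j} \left(\frac{1+u}{2}\right)^{j} \left(\frac{1-u}{2}\right)^{k-j}. \]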



This polynomial has the following properties (easily proved via elementary calculation 
and also following from the Chernoff bound): 

Claim 4.3. The polynomial $A_k(u)$ satisfies:

1. If $u \in [3/5, 1]$, then $2A_k(u) - 1 \in [1 - 2e^{-k/6},\ 1]$.

2. If $u \in [-1, -3/5]$, then $2A_k(u) - 1 \in [-1,\ -1 + 2e^{-k/6}]$.
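
The two properties in Claim 4.3 can be checked numerically. The sketch below assumes that
$A_k(u)$ is the binomial-tail polynomial $\Pr[\mathrm{Bin}(k, (1+u)/2) \ge k/2]$, which is the
reading suggested by the reference to the Chernoff bound; the function name `amplify` and
the choice $k = 30$ are illustrative only.

```python
from math import comb, exp


def amplify(k: int, u: float) -> float:
    """A_k(u) = Pr[Binomial(k, (1 + u) / 2) >= k / 2], a degree-k polynomial in u."""
    p = (1.0 + u) / 2.0
    start = (k + 1) // 2  # smallest integer j with j >= k / 2
    return sum(comb(k, j) * p**j * (1.0 - p) ** (k - j) for j in range(start, k + 1))


k = 30                         # illustrative degree, not a value fixed by the text
slack = 2.0 * exp(-k / 6.0)    # the error term 2 e^{-k/6} from Claim 4.3

# Values of 2 A_k(u) - 1 on sample points of [3/5, 1] and of [-1, -3/5].
high = [2.0 * amplify(k, u) - 1.0 for u in (0.6, 0.8, 1.0)]
low = [2.0 * amplify(k, u) - 1.0 for u in (-1.0, -0.8, -0.6)]

print("slack:", slack)
print("near +1:", high)
print("near -1:", low)
```

In this experiment every value in `high` lies within `slack` of $+1$ and every value in `low`
lies within `slack` of $-1$, which is the amplification that the definition of $q(t)$ relies on.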
We define the polynomial

\[ q(t) := 2A_k\!\left(\tfrac{4}{5} J(t)\right) - 1, \]

where $k = 15\log(1/\epsilon)$. Scaling $J(t)$ by $4/5$ ensures that the argument to $A_k$ lies
in the range $[-1, -3/5] \cup [3/5, 1]$ whenever $a \le |t| \le 1$. Applying Claim 4.3 with
$k = 15\log(1/\epsilon)$ gives

\[ \max_{|t| \in [a,1]} |q(t) - \mathrm{sign}(t)| \le 2e^{-k/6} \le \epsilon^2. \]

Finally, by selecting $c$ large enough, we have

\[ \deg(q) \le \deg(J) \cdot \deg(A_k) \le \frac{25}{a} \cdot 15\log(1/\epsilon) \le \frac{2c}{a}\log(1/\epsilon) = 2m. \]

□
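
The modulus-of-continuity calculation used in the proof above can be sanity-checked on a grid.
The following is a minimal sketch, assuming the clipped-linear $f$ from the proof, the constant 6
in Jackson's theorem, and illustrative values $a = 0.1$ and $\ell = 25/a$; the helper names are
hypothetical.

```python
def clipped_linear(t: float, a: float) -> float:
    """f(t) = t/a on [-a, a] and sign(t) on the rest of [-1, 1]."""
    return max(-1.0, min(1.0, t / a))


def modulus_of_continuity(a: float, delta: float, grid: int = 2000) -> float:
    """Grid estimate of sup |f(x) - f(y)| over x, y in [-1, 1] with |x - y| <= delta."""
    best = 0.0
    for i in range(grid + 1):
        x = -1.0 + 2.0 * i / grid
        y = min(1.0, x + delta)  # f is nondecreasing, so the largest gap uses the full step
        best = max(best, clipped_linear(y, a) - clipped_linear(x, a))
    return best


a = 0.1      # illustrative value of the parameter a
ell = 250    # ell = 25 / a
omega = modulus_of_continuity(a, 1.0 / ell)
jackson_bound = 6.0 * omega  # right-hand side of Jackson's theorem, assuming the constant 6

print("omega_f(1/ell):", omega, "expected 1/(a*ell):", 1.0 / (a * ell))
print("Jackson bound:", jackson_bound)
```

With these values the estimate agrees with $1/(a\ell)$ and the resulting bound stays below $1/4$,
as the proof requires before amplification.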

We now present the "well-behaved" polynomial $p(t)$ mentioned at the beginning of this
section. We will use Chebyshev's classical theorem on (weighted) real polynomial
approximation ([Ach56], Chapter II).

Theorem 4.4. (Chebyshev's Theorem) [Ach56] Let $f : [a, b] \to \mathbb{R}$ be a continuous
function. Let $s : [a, b] \to \mathbb{R}$ be a continuous function that does not vanish on
$[a, b]$. The polynomial $r(z)$ of degree $m$ that minimizes

\[ M(m) = \max_{z \in [a,b]} |f(z) - s(z) r(z)| \]

is unique, and it is characterized by the property that there exist $m + 2$ points
$a \le z_0 < z_1 < \cdots < z_{m+1} \le b$ such that for each $z_i$

\[ M(m) = |f(z_i) - s(z_i) r(z_i)| \]

and the sign of the error at the $z_i$'s alternates.

Theorem 4.5. Let $a$ and $m$ be as specified in Section 3. There is a univariate polynomial
$p(t)$ where $\deg(p) \le 2m + 1$ such that:

1. $p(t) \in [\mathrm{sign}(t) - \epsilon^2,\ \mathrm{sign}(t) + \epsilon^2]$ for all $|t| \in [a, 1]$;

2. $p(t) \in [-(1 + \epsilon^2),\ 1 + \epsilon^2]$ for all $t \in [-a, a]$;

3. $p(t)$ is monotonically increasing on the intervals $(-\infty, -1]$ and $[1, \infty)$.

Proof. The polynomial $p(t)$ is a best uniform approximation (such an approximation is
guaranteed to exist [Riv74]) to the function $\mathrm{sign}(t)$ of degree at most $2m + 1$ over
the domain $[-1, -a] \cup [a, 1]$. Applying Lemma 4.2, we get

\[ \max_{|t| \in [a,1]} |p(t) - \mathrm{sign}(t)| \le \max_{|t| \in [a,1]} |q(t) - \mathrm{sign}(t)| \le \epsilon^2, \]

which gives Property (1).

We can assume that $p(t)$ is odd (by replacing it with $(p(t) - p(-t))/2$ if needed). So we
can write $p(t) = t \cdot r(t^2)$, where $r(z)$ is a polynomial of degree $m$ that attains

\[ M(m) = \min_{r :\, \deg(r) \le m}\ \sup_{z \in [a^2, 1]} |1 - \sqrt{z}\, r(z)|. \]

Invoking Chebyshev's theorem with $f(z) = 1$ and $s(z) = \sqrt{z}$ (which does not vanish
on $[a^2, 1]$), we infer that the optimal polynomial $r(z)$ of degree $m$ is unique and it has an
alternating sequence of points

\[ a^2 \le z_0 < z_1 < \cdots < z_{m+1} \le 1 \]

so that the error $1 - \sqrt{z}\, r(z)$ achieves its maximum magnitude exactly at the points
$z_i$, and the sign of the error alternates. Set $t_i = \sqrt{z_i} > 0$ so that

\[ a \le t_0 < t_1 < \cdots < t_{m+1} \le 1. \]
Let $\phi(t)$ be the error function $\phi(t) = p(t) - \mathrm{sign}(t)$. Note that for
$t \ge a$, we have

\[ \phi(t) = p(t) - 1, \qquad \phi(-t) = p(-t) - (-1) = -p(t) + 1 = -\phi(t). \]

For each $t_i$, we have

\[ |\phi(t_i)| = |\phi(-t_i)| = M(m). \]

Now consider the interval $[a, 1]$, on which $\phi(t) = p(t) - 1$. Note that $\phi'(t)$ is well
defined and equals $p'(t)$ at any point in $(a, 1)$. The points $t_1, \ldots, t_m$ lie in $(a, 1)$
and they are local maxima/minima, since $\phi(t)$ cannot increase in magnitude in the
neighborhood of $t_i$. Thus $\phi'(t_i) = p'(t_i) = 0$ for each $i \in [m]$. Similarly, we can show
that $\phi'(-t_i) = p'(-t_i) = 0$ for $i \in [m]$. But $\deg(p')$ is at most $2m$, and so we have
located all its roots. As we now show, this allows us to determine the sign of $p'$ in the
intervals $(-\infty, -1]$, $[-a, a]$ and $[1, \infty)$.

Note that $p(t_1)$ is close to $1$ whereas $p(-t_1)$ is close to $-1$, and thus $p$ increases
monotonically in the interval $(-t_1, t_1)$, which includes $[-a, a]$. Also $t_1$ is a local
maximum for $p$, which shows that the $t_i$'s are maxima when $i$ is odd, and minima when $i$ is
even. Thus, since $m$ is even, $p(t_m)$ is a local minimum, so $p(t)$ increases monotonically in
the range $(t_m, \infty)$, which includes $[1, \infty)$. Likewise $-t_1$ is a local minimum for
$p$, so the $p(-t_i)$ are local minima for odd $i$ and maxima for even $i$; hence $p(t)$ is
monotonically increasing in the range $(-\infty, -t_m)$, which contains $(-\infty, -1]$. □

Using the polynomial $p(t)$, we now construct the polynomial $P(t)$ which is a good "upper"
approximator to $\mathrm{sign}(t)$ (i.e. $P(t) \ge \mathrm{sign}(t)$ for all $t$), thus completing
the proof of Theorem 3.5.

To help the reader visualize $p(t)$, we provide a schematic representation in Figure 2. (We
remark that, as before, this figure is not an actual plot, but rather is intended to qualitatively
illustrate the guarantees on the behavior of $p$ on various intervals.)

Figure 2: Qualitative representation of polynomial $p$.

Let us recall the statement of Theorem 3.5:

Theorem 3.5. (Restated.) Let $0 < \epsilon < 0.1$ and let $a$ and $K$ be as defined above. There
is a univariate polynomial $P(t)$ such that $\deg(P) \le K$ with the following properties:

(1) $P(t) \ge \mathrm{sign}(t) \ge -P(-t)$ for all $t \in \mathbb{R}$;

(2) $P(t) \in [\mathrm{sign}(t),\ \mathrm{sign}(t) + \epsilon]$ for $t \in [-1/2, -2a] \cup [0, 1/2]$;

(3) $P(t) \in [-1,\ 1 + \epsilon]$ for $t \in (-2a, 0)$;

(4) $|P(t)| \le 2 \cdot (4t)^K$ for all $|t| \ge 1/2$.

Proof. Let $p$ denote the polynomial of degree $2m + 1$ from Theorem 4.5. Consider the
following polynomial:

\[ P(t) = \tfrac{1}{2}\left(1 + \epsilon^2 + p(t + a)\right)^2 - 1. \]

Note that $\deg(P) = 2\deg(p) \le K$. We now consider the behavior of $P$ on the relevant
intervals. We repeatedly use the inequality

\[ \tfrac{1}{2}(2 + 2\epsilon^2)^2 - 1 = 1 + 4\epsilon^2 + 2\epsilon^4 \le 1 + \epsilon, \]

which holds since $\epsilon < 0.1$. Note that $P(t) \ge -1$ holds for all $t$. We now analyze the
behavior of $P(t)$ interval by interval:

(a) $t \in [-1 - a, -2a]$. Here $p(t + a) \in [-1 - \epsilon^2, -1 + \epsilon^2]$, hence $P(t) \in [-1, -1 + \epsilon]$.

(b) $t \in (-2a, 0)$. Here $p(t + a) \in [-1 - \epsilon^2, 1 + \epsilon^2]$, hence $P(t) \in [-1, 1 + \epsilon]$.

(c) $t \in [0, 1 - a]$. Here $p(t + a) \in [1 - \epsilon^2, 1 + \epsilon^2]$, hence $P(t) \in [1, 1 + \epsilon]$.

(d) $t \in (1 - a, \infty)$. Here $p(t + a) \ge 1 - \epsilon^2$, hence $P(t) \ge 1$.

This shows that $P(t) \ge \mathrm{sign}(t)$ for all $t \in \mathbb{R}$. Thus we also have

\[ P(-t) \ge \mathrm{sign}(-t) \implies \mathrm{sign}(t) \ge -P(-t), \]

which establishes Property (1). Properties (2) and (3) follow immediately from (a), (b) and
(c) above.

For Property (4), we use the following standard fact from approximation theory.

Fact 4.6. [Riv74] Let $a(t)$ be a polynomial of degree at most $d$ for which $|a(t)| \le b$ in the
interval $[-1, 1]$. Then $|a(t)| \le b\,|2t|^d$ for all $|t| \ge 1$.

Taking $a(t)$ to be $P(t/2)$, properties (2) and (3) give us that $|P(t/2)| \le 2$ for $t \in [-1, 1]$.
So the fact gives $|P(t/2)| \le 2|2t|^{4m+2}$ for $|t| \ge 1$, i.e. $|P(t)| \le 2|4t|^{4m+2}$ for
$|t| \ge 1/2$. □
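
The interval-by-interval analysis in the proof above only uses the range of $p(t + a)$, so it can
be illustrated without constructing $p$ itself. The sketch below applies the map
$p \mapsto \tfrac{1}{2}(1 + \epsilon^2 + p)^2 - 1$ to the three ranges from cases (a), (b) and (c);
the helper names and the value $\epsilon = 0.05$ are assumptions made for illustration.

```python
def upper_transform(p_value: float, eps: float) -> float:
    """P = (1/2) * (1 + eps^2 + p)^2 - 1, applied to a single value p = p(t + a)."""
    return 0.5 * (1.0 + eps**2 + p_value) ** 2 - 1.0


def transformed_range(p_lo: float, p_hi: float, eps: float, steps: int = 1000):
    """Min and max of the transform as p sweeps a grid over [p_lo, p_hi]."""
    values = [
        upper_transform(p_lo + (p_hi - p_lo) * i / steps, eps) for i in range(steps + 1)
    ]
    return min(values), max(values)


eps = 0.05       # illustrative error parameter with eps < 0.1
e2 = eps**2
case_a = transformed_range(-1.0 - e2, -1.0 + e2, eps)  # p within eps^2 of -1
case_b = transformed_range(-1.0 - e2, 1.0 + e2, eps)   # p anywhere in the transition band
case_c = transformed_range(1.0 - e2, 1.0 + e2, eps)    # p within eps^2 of +1

print("case (a):", case_a)
print("case (b):", case_b)
print("case (c):", case_c)
```

Squaring is what turns the two-sided error of $p$ into a one-sided error: values of $p$ near $-1$
are sent to just above $-1$, and values near $+1$ to just above $+1$.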

5 Fooling non-regular halfspaces 

In this section we show how to fool halfspaces that are not regular. We proceed by case
analysis based on the critical index of the halfspace, which we define shortly. Throughout
this section we assume that the weights of the halfspace are decreasing in magnitude:

\[ |w_1| \ge |w_2| \ge \cdots \ge |w_n|. \]

We can assume this without loss of generality because we are going to prove that, for a
suitable $k$, any $k$-wise independent distribution fools such halfspaces, and the property of
being $k$-wise independent is clearly invariant under permutation of the variables.

Some notation: For $T \subseteq [n]$ we denote by $\sigma_T$ the quantity
$\sigma_T := \sqrt{\sum_{i \in T} w_i^2}$. For $k \in [n]$ we also write $\sigma_k$ for
$\sigma_{\{k, k+1, \ldots, n\}}$.

Definition 5.1 (Critical index). We define the $\tau$-critical index $\ell(\tau)$ of a halfspace
$h = \mathrm{sign}(w \cdot x - \theta)$ as the smallest index $i \in [n]$ for which

\[ |w_i| \le \tau \cdot \sigma_i. \]

If this inequality does not hold for any $i \in [n]$, we define $\ell(\tau) = \infty$.
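
Definition 5.1 translates directly into a single pass over the weights. The following is a
minimal sketch, assuming the weights are already sorted by decreasing magnitude; the function
name `critical_index` and the sample weight vectors are illustrative.

```python
from math import inf, sqrt


def critical_index(weights: list[float], tau: float) -> float:
    """Smallest 1-based i with |w_i| <= tau * sigma_i, or inf if no such index exists.

    Assumes the weights are sorted by decreasing magnitude, as in the text.
    """
    tail_sq = sum(w * w for w in weights)  # sigma_1^2
    for i, w in enumerate(weights, start=1):
        if abs(w) <= tau * sqrt(max(tail_sq, 0.0)):
            return i
        tail_sq -= w * w  # move from sigma_i^2 to sigma_{i+1}^2
    return inf


regular = critical_index([1.0] * 100, 0.3)                      # all weights equal
two_heavy = critical_index([8.0, 4.0] + [1.0] * 50, 0.3)        # two heavy weights, then a flat tail
geometric = critical_index([2.0 ** -i for i in range(10)], 0.3) # geometrically decreasing weights

print(regular, two_heavy, geometric)
```

The three examples cover the regimes discussed in this section: a regular halfspace (index 1),
a halfspace that becomes regular after a short head, and one whose critical index is infinite.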
Note that a halfspace is $\tau$-regular if $\ell(\tau) = 1$; in this section we handle the case
$\ell(\tau) > 1$.

We assume without loss of generality that $\epsilon$ is sufficiently small. Given $\epsilon$, our
threshold for the critical index is

\[ L(\epsilon) := \frac{8}{\epsilon^2}\log^2(10/\epsilon). \]

We argue separately depending on whether $\ell(\epsilon) > L(\epsilon)$ or not. Both proofs rely
on the following simple property of $k$-wise independent distributions.

Fact 5.2. Let $\mathcal{D}$ be a $k$-wise independent distribution over $\{-1,+1\}^n$. Condition
on any fixed values for any $t \le k$ bits of $\mathcal{D}$, and let $\mathcal{D}'$ be the
projection of $\mathcal{D}$ on the other $n - t$ bits. Then $\mathcal{D}'$ is $(k - t)$-wise
independent.
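
Fact 5.2 can be verified by brute force on a small example. The sketch below uses the uniform
distribution over the eight points of $\{-1,+1\}^4$ whose last bit is the product of the first
three, which is 3-wise but not 4-wise independent; this example distribution and the helper
names are assumptions chosen for illustration, not a construction from the text.

```python
from itertools import combinations, product


def is_k_wise_independent(support: list[tuple[int, ...]], k: int) -> bool:
    """True if the uniform distribution over `support` is k-wise independent.

    Every set of k coordinates must take each pattern in {-1, +1}^k equally often.
    """
    n = len(support[0])
    for coords in combinations(range(n), k):
        counts: dict[tuple[int, ...], int] = {}
        for point in support:
            key = tuple(point[c] for c in coords)
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != 2**k or len(set(counts.values())) != 1:
            return False
    return True


def condition(support, fixed: dict[int, int]):
    """Keep the points that agree with `fixed`, then project onto the other coordinates."""
    rest = [c for c in range(len(support[0])) if c not in fixed]
    return [
        tuple(p[c] for c in rest)
        for p in support
        if all(p[c] == v for c, v in fixed.items())
    ]


# Example distribution: x4 = x1 * x2 * x3 with x1, x2, x3 uniform.
parity = [(x1, x2, x3, x1 * x2 * x3) for x1, x2, x3 in product((-1, 1), repeat=3)]
after_one_bit = condition(parity, {0: 1})  # fix t = 1 bit, leaving 3 coordinates

print(is_k_wise_independent(parity, 3), is_k_wise_independent(parity, 4))
print(is_k_wise_independent(after_one_bit, 2))
```

Here $k = 3$ and $t = 1$, so the fact predicts that the conditioned distribution on the remaining
three bits is 2-wise independent.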

The first theorem addresses the simpler case when $\ell(\epsilon) \le L(\epsilon)$.

Theorem 5.3 (Fooling non-regular halfspaces with small critical index). Let $h(x)$ be a
halfspace with $\epsilon$-critical index $\ell(\epsilon) \le L(\epsilon)$. Then any
$(K(\epsilon) + L(\epsilon))$-wise independent distribution $O(\epsilon)$-fools $h$.

Proof. Condition on any setting to the first $\ell - 1$ variables (the head), where
$\ell = \ell(\epsilon)$. Each such setting defines a halfspace of the form

\[ h'(x_\ell, \ldots, x_n) = \mathrm{sign}\Big(\sum_{i \ge \ell} w_i x_i - \theta'\Big), \]

where $\theta'$ depends on the values assigned to the head. Every such halfspace is
$\epsilon$-regular by the definition of $\epsilon$-critical index. Also, the conditional
distribution on the remaining variables is $K(\epsilon)$-wise independent by Fact 5.2. Thus,
Theorem 3.2 implies that we fool $h'$ with error $\epsilon$. Since both the uniform distribution
and $\mathcal{D}$ induce the same (uniform) distribution on the first $\ell - 1$ variables, an
averaging argument concludes the proof of the theorem.

□ 

In the rest of this section we study the case of large critical index
$\ell(\epsilon) > L(\epsilon)$, and prove the following theorem.

Theorem 5.4 (Fooling non-regular halfspaces with large critical index). Let $h(x)$ be a
halfspace with critical index $\ell(\epsilon) > L(\epsilon)$. Any $(L(\epsilon) + 2)$-wise
independent distribution $\mathcal{D}$ fools $h$ with error $9\epsilon$.

To prove Theorem 5.4 we partition the coordinate set $[n]$ into a head $H$ consisting of
the first $L(\epsilon)$ coordinates, and a tail $T = [n] \setminus H$ consisting of the rest. We
then show that a random setting of the head variables induces with high probability a partial
sum $\sum_{i \in H} w_i x_i - \theta$ which is so large in magnitude that the values of the tail
variables are essentially irrelevant, in the sense that they are very unlikely to change the sign
of $w \cdot x - \theta$ and hence the value of the halfspace.

We will show that this statement holds both for the uniform distribution and for the
distribution $\mathcal{D}$ with limited independence. For the latter we will use that after
restricting the variables in the head we still have a 2-wise independent distribution on the tail
(by Fact 5.2), which is enough for Chebyshev's concentration bound to apply. To show that the
partial sum is likely to be large we use ideas from [Ser07], in particular that the weights
decrease geometrically up to the critical index.



5.1 Proof of Theorem 5.4 

We partition the coordinate set $[n]$ into a head $H$ consisting of the first $L(\epsilon)$
coordinates, and a tail $T = [n] \setminus H$ consisting of the rest. Any fixing of the variables
in $H$ results in a halfspace

\[ h'(x_T) := \mathrm{sign}\Big(\sum_{i \in T} w_i x_i - \theta'_H\Big) \]

over the tail variables $x_T$, where

\[ \theta'_H := \theta - \sum_{i \in H} w_i x_i. \]

As discussed before, our goal is to show that, for a random setting of the head variables,
$\theta'_H$ is likely to be so large in magnitude that the value of the tail sum
$\sum_{i \in T} w_i x_i$ is unlikely to influence the outcome of $h(x)$. The key idea here is the
following lemma from [Ser07] showing that the weights decrease geometrically up to the critical
index.

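Lemma 5.5. For any $1 \le i < j \le \ell + 1$ we have

\[ |w_j| \le \sigma_j \le \left(\sqrt{1 - \epsilon^2}\right)^{j-i} \sigma_i \le \left(\sqrt{1 - \epsilon^2}\right)^{j-i} |w_i|/\epsilon. \]

In particular, if $j \ge i + (4/\epsilon^2)\ln(1/\epsilon)$ then

\[ |w_j| \le |w_i|/3. \]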

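Proof. For any $k < \ell$, we have by the definition of $\epsilon$-critical index that

\[ w_k^2 > \epsilon^2 \sigma_k^2. \]

Hence

\[ \sigma_{k+1}^2 = \sigma_k^2 - w_k^2 < (1 - \epsilon^2)\,\sigma_k^2. \]

Repeating this calculation yields

\[ \sigma_j^2 < (1 - \epsilon^2)^{j-i}\,\sigma_i^2. \]

To conclude the first chain of inequalities in the statement of the lemma, use again
$\sigma_i^2 < w_i^2/\epsilon^2$ and the obvious inequality $\sigma_j^2 \ge w_j^2$. The "in
particular" part can be verified by straightforward calculation, using that $\epsilon$ is
sufficiently small. □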
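
The geometric decay in Lemma 5.5 can be observed on concrete weights. The sketch below checks
the chain $|w_j| \le \sigma_j \le (\sqrt{1-\epsilon^2})^{j-i}\sigma_i$ together with
$\sigma_i \le |w_i|/\epsilon$ for all $1 \le i < j \le \ell$, which is the range covered directly by
the recursion in the proof. The helper names, the tolerance, and the sample weights are
assumptions for illustration.

```python
from math import sqrt


def tail_norms(weights):
    """Return s with s[k] = sigma_k = sqrt(sum_{i >= k} w_i^2), 1-indexed (s[0] is unused)."""
    s = [0.0] * (len(weights) + 2)
    for k in range(len(weights), 0, -1):
        s[k] = sqrt(s[k + 1] ** 2 + weights[k - 1] ** 2)
    return s


def decay_chain_holds(weights, eps, ell):
    """Check |w_j| <= sigma_j <= r^(j-i) * sigma_i and sigma_i <= |w_i| / eps,
    with r = sqrt(1 - eps^2), for every 1 <= i < j <= ell (1-based indices)."""
    s = tail_norms(weights)
    r = sqrt(1.0 - eps**2)
    tol = 1e-12  # slack for floating-point rounding
    for i in range(1, ell):
        for j in range(i + 1, ell + 1):
            w_i, w_j = abs(weights[i - 1]), abs(weights[j - 1])
            shrink = r ** (j - i)
            if not (
                w_j <= s[j] + tol
                and s[j] <= shrink * s[i] + tol
                and s[i] <= w_i / eps + tol
            ):
                return False
    return True


weights = [8.0, 4.0] + [1.0] * 50  # eps-critical index 3 for eps = 0.3
ok = decay_chain_holds(weights, eps=0.3, ell=3)
print(ok)
```

If `ell` is set larger than the true critical index, the check fails in general, since the step
$\sigma_i \le |w_i|/\epsilon$ is exactly the statement that index $i$ lies before the critical index.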
Now consider the set of

\[ t := \log(10/\epsilon) \]

"nicely separated" coordinates (variables)

\[ G := \{\, k_i := 1 + i \cdot (4/\epsilon^2)\ln(1/\epsilon) : i = 0, 1, \ldots, t - 1 \,\} \subseteq H. \]

Observe that indeed $G \subseteq H$ because the maximum index in $G$ is at most
$1 + t \cdot (4/\epsilon^2)\log(1/\epsilon) \le (4/\epsilon^2)\log^2(10/\epsilon)$, whereas $H$
consists of all the first $L(\epsilon) = (8/\epsilon^2)\log^2(10/\epsilon)$ indices. The key
features of $G$ are that we can apply the "in particular" part of Lemma 5.5 and prove the
following claim.

Claim 5.6. $\sigma_T \le \epsilon\,|w_{k_t}|$.

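Proof. By our choice of $L(\epsilon)$, $t$, and $k_t$, we have

\[ L(\epsilon) - k_t \ge 8\log^2(10/\epsilon)/\epsilon^2 - 4\log^2(10/\epsilon)/\epsilon^2 \ge \log^2(1/\epsilon)/\epsilon^2. \]

An application of Lemma 5.5 gives

\[ \sigma_T \le \left(\sqrt{1 - \epsilon^2}\right)^{\log^2(1/\epsilon)/\epsilon^2} |w_{k_t}|/\epsilon \le \epsilon\,|w_{k_t}|, \]

where we use that $\epsilon$ is sufficiently small. □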

We now show that a random setting of $H$ is likely to result in a value of $|\theta'_H|$ which
is at least $|w_{k_t}|/4$. The proof relies on the following claim.

Claim 5.7. Let $v_1 > v_2 > \cdots > v_t > 0$ be a sequence of numbers so that
$v_{i+1} \le v_i/3$. Then for any two points $x \ne y \in \{-1,+1\}^t$, we have
$|v \cdot x - v \cdot y| \ge v_t$.

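Proof. Let $z := x - y \in \{-2, 0, 2\}^t$, which is not zero. Let $j \le t$ be the smallest
index such that $z_j \ne 0$. Then

\[ |v \cdot x - v \cdot y| = |v \cdot z| = \Big|\sum_{i \ge j} v_i z_i\Big| \ge |v_j z_j| - \sum_{i > j} |v_i z_i| \ge 2\Big(v_j - \sum_{i > j} v_i\Big) \]
\[ \ge 2\Big(v_j - \sum_{i > j} \frac{v_j}{3^{i-j}}\Big) \ge 2(v_j - v_j/2) = v_j \ge v_t, \]

using $v_i \le v_j/3^{i-j}$ by assumption. □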
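
Claim 5.7 is easy to confirm exhaustively for small $t$. The following sketch enumerates all sign
vectors and reports the smallest gap between two distinct values of $v \cdot x$; the function name
`min_gap` and the example sequences are illustrative assumptions.

```python
from itertools import product


def min_gap(v):
    """Smallest |v.x - v.y| over distinct sign vectors x, y in {-1, +1}^t."""
    sums = sorted(
        sum(vi * xi for vi, xi in zip(v, x)) for x in product((-1, 1), repeat=len(v))
    )
    return min(b - a for a, b in zip(sums, sums[1:]))


v = [27.0, 9.0, 3.0, 1.0]  # each entry is at most a third of the previous one
gap = min_gap(v)
print("smallest gap:", gap, "last weight:", v[-1])

# Without the factor-3 decay the claim can fail: 2 - 1 + 1 and 2 + 1 - 1 coincide.
collision_gap = min_gap([2.0, 1.0, 1.0])
print("gap without decay:", collision_gap)
```

The second example shows why the separation hypothesis matters: once the decay condition is
dropped, two different sign patterns can produce the same sum.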

We are now ready to show our intended lemma:

Lemma 5.8. $\Pr_{x_i :\, i \in H}\big[\, |\theta - \sum_{i \in H} w_i x_i| \le |w_{k_t}|/4 \,\big] \le \epsilon/10$.

Proof. Fix any assignment to the variables in $H \setminus G$. For this fixing, the event
$|\theta - \sum_{i \in H} w_i x_i| \le |w_{k_t}|/4$ happens only if

\[ \sum_{i \in G} w_i x_i \in \Big[\theta - \sum_{i \in H \setminus G} w_i x_i - \frac{|w_{k_t}|}{4},\ \ \theta - \sum_{i \in H \setminus G} w_i x_i + \frac{|w_{k_t}|}{4}\Big], \]

i.e., $\sum_{i \in G} w_i x_i$ lies in an interval of length $|w_{k_t}|/2$. Applying Claim 5.7 to
the weights in $G$, any two possible outcomes of $\sum_{i \in G} w_i x_i$ differ by at least
$|w_{k_t}|$. So there is at most one setting $x_{k_1} = a_1, \ldots, x_{k_t} = a_t$ of the
variables in $G$ for which this event occurs. This setting has probability at most
$2^{-t} \le \epsilon/10$. □



With this lemma in hand, we can show that limited independence suffices to fool
halfspaces with a large critical index.

Proof of Theorem 5.4. We compare the behavior of $h(x)$ on $\mathcal{D}$ and the uniform
distribution $\mathcal{U}$. In either case, the marginal distribution for the variables in $H$ is
uniform. For each setting of these variables, we are left with a halfspace of the form
$h'(x_T) = \mathrm{sign}(\sum_{i \in T} w_i x_i - \theta'_H)$ on the variables in $T$. The
combination of Lemma 5.8 and Claim 5.6 shows that with probability at least $1 - \epsilon/10$
we have

\[ \Big|\theta - \sum_{i \in H} w_i x_i\Big| \ge \frac{|w_{k_t}|}{4} \ge \frac{\sigma_T}{4\epsilon}. \qquad (\star) \]

We condition on this event $(\star)$. Consider the projections $\mathcal{U}'$ and
$\mathcal{D}'$ of $\mathcal{U}$ and $\mathcal{D}$ on $x_T$. By Fact 5.2, $\mathcal{D}'$ is 2-wise
independent. We now argue that for both $\mathcal{U}'$ and $\mathcal{D}'$, it is very likely that
$h'(x_T) = -\mathrm{sign}(\theta'_H)$ (for small enough $\epsilon$). Indeed if this does not
happen, then we have

\[ \Big|\sum_{i \in T} w_i x_i\Big| \ge \Big|\theta - \sum_{i \in H} w_i x_i\Big| \ge \frac{\sigma_T}{4\epsilon}. \]

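Under the uniform distribution, by a Hoeffding bound (Theorem 2.5), the probability of
this event is bounded by

\[ \Pr_{\mathcal{U}'}\Big[\Big|\sum_{i \in T} w_i x_i\Big| \ge \frac{\sigma_T}{4\epsilon}\Big] \le 2e^{-1/(32\epsilon^2)} \le 4\epsilon. \]

While by Chebyshev's inequality (Theorem 2.6) we get

\[ \Pr_{\mathcal{D}'}\Big[\Big|\sum_{i \in T} w_i x_i\Big| \ge \frac{\sigma_T}{4\epsilon}\Big] \le 16\epsilon^2 \le 4\epsilon. \]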



Thus, we have

\[ \big|E_{\mathcal{D}'}[h'(x_T)] - E_{\mathcal{U}'}[h'(x_T)]\big| \le 2\,\big|\Pr_{\mathcal{D}'}[h'(x_T) = -\mathrm{sign}(\theta'_H)] - \Pr_{\mathcal{U}'}[h'(x_T) = -\mathrm{sign}(\theta'_H)]\big| \le 8\epsilon. \]

To conclude, our goal was to bound from above $|E_{\mathcal{U}}[h(x)] - E_{\mathcal{D}}[h(x)]|$.
Using the fact that both distributions induce the uniform distribution on variables in $H$, and
conditioning on the event $(\star)$, we get

\[ \big|E_{\mathcal{U}}[h(x)] - E_{\mathcal{D}}[h(x)]\big| \le 8\epsilon + 2 \cdot \epsilon/10 \le 9\epsilon. \]

□
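
To get a feel for the two tail bounds used in the proof above, the sketch below evaluates them
for a few values of $\epsilon$. It assumes the Hoeffding bound takes the form
$2\exp(-\lambda^2/2)$ and Chebyshev's inequality the form $1/\lambda^2$ for deviations of
$\lambda\sigma_T$, with $\lambda = 1/(4\epsilon)$; the function names are hypothetical.

```python
from math import exp


def hoeffding_tail(eps: float) -> float:
    """2 * exp(-lambda^2 / 2) at lambda = 1 / (4 * eps): the bound used under U'."""
    return 2.0 * exp(-1.0 / (32.0 * eps**2))


def chebyshev_tail(eps: float) -> float:
    """1 / lambda^2 at lambda = 1 / (4 * eps): the bound used under the 2-wise independent D'."""
    return 16.0 * eps**2


for eps in (0.1, 0.05, 0.01):
    print(
        f"eps={eps}: Hoeffding {hoeffding_tail(eps):.3e}, "
        f"Chebyshev {chebyshev_tail(eps):.3e}, target {4 * eps:.3e}"
    )
```

The Hoeffding bound is far stronger, but the argument only needs both tails to be at most
$4\epsilon$, which is why pairwise independence on the tail is enough.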

Our main result, Theorem 1.2, follows immediately from Theorem 3.2, Theorem 5.3 and
Theorem 5.4.

5.2 Proof of the main theorem 

For completeness, in this section we summarize what is needed to prove our main theorem.

Theorem 1.2 (Main). (Restated.) Let $\mathcal{D}$ be a $k$-wise independent distribution on
$\{-1,+1\}^n$, and let $h : \{-1,+1\}^n \to \{-1,+1\}$ be a halfspace. Then $\mathcal{D}$ fools
$h$ with error $\epsilon$, i.e.,

\[ \big|E_{x \leftarrow \mathcal{D}}[h(x)] - E_{x \leftarrow \mathcal{U}}[h(x)]\big| \le \epsilon, \quad \text{provided } k \ge \frac{C}{\epsilon^2}\log^2(1/\epsilon), \]

where $C$ is an absolute constant and $\mathcal{U}$ is the uniform distribution over
$\{-1,+1\}^n$.

Proof. Consider the parameters $K(\epsilon)$, $L(\epsilon)$ defined in Sections 3 and 5,
respectively, and recall that they are both $O(\log^2(1/\epsilon)/\epsilon^2)$. For a given
halfspace, consider its critical index $\ell$. If $\ell \le L(\epsilon)$ we apply Theorem 5.3,
otherwise we apply Theorem 5.4. □

6 Conclusions 

We feel that Theorem 1.2 is of independent interest and may find other applications aside
from pseudorandomness. For instance, consider the problem of estimating the influence
of a variable in a halfspace [GR09]. It is easy to verify that for any halfspace
$h(x) = \mathrm{sign}(\sum_{j=1}^{n} w_j x_j - \theta)$ the influence of the $i$-th variable
equals $E_{y \in \mathcal{U}}[h'(y)]$, where $h'$ is the halfspace defined by
$h'(y) = \mathrm{sign}(\sum_{j \ne i} w_j y_j - \theta y_i + w_i)$. Thus, one can use
$\tilde{O}(\epsilon^{-2})$-wise independence to estimate the influence to within an additive
$\epsilon$. Note that, for any halfspace, the bias and the influences together are (respectively)
the Fourier coefficients at levels 0 and 1. They are collectively called the Chow parameters of a
halfspace (after a theorem of C.K. Chow showing that these numbers uniquely specify the
halfspace [Cho61]) and have been well-studied in the literature [Gol06, Ser07, OS08, MORS09].
Our result implies that the Chow parameters of a halfspace can be estimated to within accuracy
$\epsilon$ using bounded independence.
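
The identity between the influence and the expectation of the derived halfspace $h'$ can be
checked by enumeration on a small instance. The sketch below assumes nonnegative weights, the
convention $\mathrm{sign}(0) = +1$, and weights chosen so that no tie occurs; the function names
and the numerical values are illustrative.

```python
from itertools import product


def sign(z: float) -> int:
    return 1 if z >= 0 else -1


def influence(w, theta, i):
    """Pr_x[h(x) != h(x with bit i flipped)] under the uniform distribution, by enumeration."""
    n, flips = len(w), 0
    for x in product((-1, 1), repeat=n):
        y = list(x)
        y[i] = -y[i]
        hx = sign(sum(a * b for a, b in zip(w, x)) - theta)
        hy = sign(sum(a * b for a, b in zip(w, y)) - theta)
        flips += hx != hy
    return flips / 2**n


def derived_halfspace_mean(w, theta, i):
    """E_y[h'(y)] for h'(y) = sign(sum_{j != i} w_j y_j - theta * y_i + w_i)."""
    n, total = len(w), 0
    for y in product((-1, 1), repeat=n):
        rest = sum(w[j] * y[j] for j in range(n) if j != i)
        total += sign(rest - theta * y[i] + w[i])
    return total / 2**n


w, theta = [3.0, 2.0, 1.5, 0.7], 0.25  # example halfspace with positive weights and no ties
pairs = [(influence(w, theta, i), derived_halfspace_mean(w, theta, i)) for i in range(len(w))]
for i, (inf_i, mean_i) in enumerate(pairs):
    print(f"variable {i}: influence {inf_i}, E[h'] {mean_i}")
```

Because the influence is expressed as the bias of a single halfspace, any distribution that fools
halfspaces also yields an estimate of each influence.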

Our results, together with the lower bound of [BGGP07], are essentially optimal in terms
of characterizing the degree of independence that is required to $\epsilon$-fool halfspaces.
However, many natural and interesting directions remain for future work.

One obvious goal is to construct unconditional pseudorandom generators for halfspaces
that have a better dependence on $\epsilon$ than our construction. The ultimate goal here is to
achieve the information-theoretically optimal seed length, i.e. $s = O(\log(n/\epsilon))$.

Another natural, though perhaps challenging, goal is to understand the degree of
independence that is required to $\epsilon$-fool degree-$d$ polynomial threshold functions over
$\{-1,+1\}^n$. For constant $\epsilon$ and constant $d$, does $O(1)$-wise independence suffice to
fool degree-$d$ PTFs? As far as we know nothing is known about this question, even for $d = 2$.

A third question that is related to our work is whether it is possible to derandomize
the problem of approximately counting the number of satisfying assignments for a given
halfspace. Our results give a single fixed and easily constructible set of
$n^{\tilde{O}(1/\epsilon^2)}$ many points which can be used to deterministically obtain a
$\pm\epsilon$-accurate estimate of $\Pr_{\mathcal{U}}[f(x) = 1]$ for any halfspace $f$ in time
$n^{\tilde{O}(1/\epsilon^2)}$. However, there is a deterministic algorithm of [Ser07] which
takes integer weights and threshold $w_1, \ldots, w_n, \theta$ (each $\mathrm{poly}(n)$ bits long)
as input and runs in time $\mathrm{poly}(n) \cdot 2^{\tilde{O}(1/\epsilon^2)}$. Can a
$\mathrm{poly}(n, 1/\epsilon)$-time deterministic algorithm be obtained?

Acknowledgements. Rocco Servedio thanks Troy Lee for a helpful conversation about 
amplifying polynomials and Adam Klivans for useful conversations about Jackson's Theorem. 
Parikshit Gopalan would like to thank Jaikumar Radhakrishnan and Amir Shpilka for many 
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